test: stop leaked-spy cascades across the whole test runner by ndycode · Pull Request #590 · ndycode/codex-multi-auth

ndycode · 2026-06-11T08:01:57Z

Summary

Isolation hardening for the whole test runner, fixing a cross-test contamination class discovered while writing refactor: migrate the last two hand-rolled retry loops to withRetry #585's rename-retry tests: a test that fails before its inline mockRestore() leaks its spy into every later test of the file.

What Changed

The failure mechanics, observed live in sandboxed environments where the EACCES-baseline tests die mid-run:

An early test spies a method (e.g. fs.rename) and crashes before reaching its mockRestore() (no try/finally).
A later test calls vi.spyOn on the same method — vitest returns the same leaked spy rather than wrapping again.
That test's "original" passthrough binding (fs.rename.bind(fs)) therefore captures the spy itself, and calling through recurses into the test's own mock — a RangeError: Maximum call stack size exceeded crash with a confusing stack.

Two layers land here:

test/storage.test.ts (first commit): the file's top-level afterEach restored env vars but never mocks; it now calls vi.restoreAllMocks(), with a comment documenting the mechanism and (per review) its relationship to the global hook. The storage suite is where the cascade was observed.
test/helpers/global-sandbox.ts (second commit): a repo scan found 16 more test files with the same shape (bare mockRestore() at the end of a test body, no global restore hook). Rather than patching each, the shared setup file — already loaded by every suite — registers one global afterEach(() => vi.restoreAllMocks()), containing the cascade everywhere. No suite creates spies in beforeAll (verified by scan), so an after-each restore cannot strand intentional cross-test spies.

No assertions change anywhere; the per-test mockRestore() calls remain as harmless redundancy.

Validation

npm run typecheck + npx eslint on both files (pre-commit hook re-runs both)
npm test -- test/storage.test.ts with the file-level hook — failure names a strict subset (19) of the 23-name environment baseline block, zero new names
Scan proof for the global hook: beforeAll + spyOn co-occurrence search over test/ returns nothing, so no suite relies on a spy surviving past a test
Full-suite run with the global hook (all 317 files, --maxWorkers=1): 54 failing names, all in the 58-name docs/audits/evidence/test-baseline-2026-06-10.txt environment baseline, zero new names — and four baseline names now PASS (the storage rotation/backup/artifact tests that were cascade victims, not genuine environment failures). Strict improvement; full name-diff posted as a PR comment.
npm run build deferred to CI

Docs and Governance Checklist

No user-visible behavior changed; test isolation only

Risk and Rollback

Risk level: low — vi.restoreAllMocks() only affects vi.spyOn spies, which every suite already restores per-test on the success path; the hooks merely cover the failure path. The one theoretical hazard (a suite depending on a beforeAll spy across tests) is excluded by scan.
Rollback plan: revert either commit independently; they are belt-and-braces layers of the same defense.

https://claude.ai/code/session_01XNtnkLbBiXZxfQQYLMpucB

A test in this file that fails before its inline mockRestore() leaks its fs spy into every later test: a later vi.spyOn on the same method returns the SAME leaked spy, so a passthrough binding captured from it recurses into the new test's own mock (a Maximum call stack crash observed in sandboxed environments where the EACCES baseline tests die mid-run). The top-level afterEach restored env vars but never mocks; vi.restoreAllMocks() there guarantees one failure cannot cascade. No assertion changes; the per-test mockRestore() calls remain as harmless redundancy. https://claude.ai/code/session_01XNtnkLbBiXZxfQQYLMpucB

chatgpt-codex-connector · 2026-06-11T08:02:03Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

coderabbitai · 2026-06-11T08:02:05Z

📝 Walkthrough

Walkthrough

Test infrastructure adds global and per-test-file afterEach hooks that call vi.restoreAllMocks() after each test to prevent leaked mock spies from affecting later test runs.

Changes

Vitest Mock Spy Cleanup

Layer / File(s)	Summary
Mock restoration in afterEach hooks `test/helpers/global-sandbox.ts`, `test/storage.test.ts`	Global sandbox helper registers a suite-level `afterEach` hook that restores all mocks, and `test/storage.test.ts` adds explicit unconditional `vi.restoreAllMocks()` in its own `afterEach` cleanup to prevent spy leakage between tests.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Possibly related PRs

ndycode/codex-multi-auth#585: Both PRs touch test/storage.test.ts to add/guard vi.restoreAllMocks() cleanup to prevent leaked fs.rename spies from affecting later tests.
ndycode/codex-multi-auth#321: Both PRs adjust Vitest mock/spies lifecycle to prevent cross-test leakage (main adds vi.restoreAllMocks() in afterEach, retrieved strengthens CLI test isolation by fully resetting/hardening mocks before each test).

Suggested labels

bug

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	title follows conventional commits format with 'test' type, includes scope context about spy cascades, and is 59 chars under the 72-char limit.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	pr description is thorough and addresses all template sections with clear technical detail on the cross-test contamination issue and validation evidence.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch claude/audit-71-storage-spy-hygiene

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch claude/audit-71-storage-spy-hygiene

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

https://claude.ai/code/session_01XNtnkLbBiXZxfQQYLMpucB

Extends the storage-suite fix to the whole runner: a scan found 16 more test files with the same vulnerability (bare mockRestore() at the end of a test body, no global restore hook), where one mid-test failure leaks a spy that a later vi.spyOn on the same method returns verbatim, sending passthrough bindings into infinite recursion. No suite creates spies in beforeAll, so an after-each restoreAllMocks in the shared global-sandbox setup file is safe and contains the cascade everywhere. https://claude.ai/code/session_01XNtnkLbBiXZxfQQYLMpucB

docs: include token-refresh in the rotation helper cluster description

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/storage.test.ts`:
- Around line 56-60: The test file contains a redundant per-suite call to
vi.restoreAllMocks() that is already handled by the global afterEach hook in
test/helpers/global-sandbox.ts; remove the vi.restoreAllMocks() call from
storage.test.ts to avoid maintenance noise, or if you prefer to keep the
defensive call, add a one-line comment next to vi.restoreAllMocks() referencing
the global afterEach in test/helpers/global-sandbox.ts (and its line range) so
future maintainers understand it is redundant but intentional.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: be64b33f-4483-4090-ab76-e77572afabb2

📥 Commits

Reviewing files that changed from the base of the PR and between 1d6c3e0 and 7204596.

📒 Files selected for processing (2)

test/helpers/global-sandbox.ts
test/storage.test.ts

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (8)

**/*.{ts,js,mts,cjs}

📄 CodeRabbit inference engine (AGENTS.md)

Use ESM only with "type": "module" in package.json; Node >= 18.17 required

Files:

test/helpers/global-sandbox.ts
test/storage.test.ts

**/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

Do not use as any, @ts-ignore, or @ts-expect-error TypeScript assertions

Files:

test/helpers/global-sandbox.ts
test/storage.test.ts

**/*.{ts,js}

📄 CodeRabbit inference engine (AGENTS.md)

OAuth callback server uses port 1455; do not hardcode OAuth ports—use existing constants/helpers

Files:

test/helpers/global-sandbox.ts
test/storage.test.ts

**/*.{js,ts}

📄 CodeRabbit inference engine (README.md)

**/*.{js,ts}: Use JSON format for machine-readable output in diagnostic and reporting commands (status, check, report, monitor, why-selected)
Implement interactive terminal dashboard with hotkeys (Up/Down for navigation, Enter for selection, 1-9 for quick switch, / for search, ? for help, Q for back)
Support device authentication flow via --device-auth flag and manual OAuth callback paste fallback via --manual flag for headless environments
Run npm version check during normal forwarded Codex startup and print upgrade notices only on interactive TTY or when CODEX_MULTI_AUTH_DEBUG=1 is set

Files:

test/helpers/global-sandbox.ts
test/storage.test.ts

test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

test/helpers/global-sandbox.ts
test/storage.test.ts

**

⚙️ CodeRabbit configuration file

**: # PROJECT KNOWLEDGE BASE

Generated: 2026-04-25
Commit: a87e005
Validated: 2026-06-10 against commit 98d9819 (repo audit; claims re-checked against the tree, content not regenerated)
Branch: main
Package version: 2.3.0-beta.2

OVERVIEW

codex-multi-auth is a Codex CLI-first OAuth account manager and optional forwarding wrapper for the official Codex CLI. The installed codex-multi-auth entrypoint handles account-management commands locally, codex-multi-auth-codex forwards official Codex commands through this package's wrapper when explicitly used, and runtime rotation can route live Responses traffic through a localhost account-rotation proxy by default. The plugin-host entrypoint remains exported for compatibility, but the primary product surface is the account manager, optional wrapper, storage, runtime proxy, and repair tooling.

STRUCTURE
./
├── scripts/
│   ├── codex.js              # codex-multi-auth-codex wrapper, official CLI forwarder, shadow CODEX_HOME/runtime proxy setup
│   ├── codex-multi-auth.js   # standalone package CLI entrypoint
│   ├── codex-routing.js      # auth command and compatibility alias routing
│   ├── codex-bin-resolver.js # official Codex binary discovery
│   ├── codex-app-router.js   # persistent localhost router for packaged Codex app bind
│   └── codex-app-launcher.js # reversible user-level app launcher routing helper
├── index.ts                  # optional plugin-host runtime entry
├── lib/                      # core runtime logic (see lib/AGENTS.md)
│   ├── auth/                 # OAuth flow, PKCE, callback server
│   ├── runtime/              # Codex CLI/app integration helpers, app bind, live sync, runtime observability
│   ├── request/              # request transform, SSE, failover, backoff
│   ├── storage/              # path resolution, migrations, backups, restore, import/export
│   ├── codex-cli/            # Codex CLI state sync and writer helpers
│   ├── codex-manager/        # command modules and...

Files:

test/helpers/global-sandbox.ts
test/storage.test.ts

test/**/*.test.ts

📄 CodeRabbit inference engine (test/AGENTS.md)

test/**/*.test.ts: Write Vitest test suites with globals enabled (describe, it, expect)
Maintain 80%+ coverage threshold across statements, branches, functions, and lines
Use removeWithRetry() for Windows filesystem cleanup instead of bare fs.rm to handle EBUSY, EPERM, and ENOTEMPTY errors
Do not rely on dist/ in tests; use source files instead
Do not skip tests without justification
Relax lint rules for test files as configured in eslint.config.js

Files:

test/storage.test.ts

test/**/storage*.test.ts

📄 CodeRabbit inference engine (test/AGENTS.md)

Test V3 storage, worktree migration, and concurrent load scenarios in storage.test.ts and storage-async.test.ts

Files:

test/storage.test.ts

🧠 Learnings (2)

📚 Learning: 2026-06-04T06:14:18.093Z

Learnt from: ndycode
Repo: ndycode/codex-multi-auth PR: 510
File: test/scheduling-strategy-config.test.ts:1-1
Timestamp: 2026-06-04T06:14:18.093Z
Learning: In ndycode/codex-multi-auth, do not flag explicit imports from "vitest" (e.g., describe, it, expect, beforeEach/afterEach, etc.) in test files as issues—even if the Vitest config sets `globals: true`. The repo’s established convention is to keep these imports for consistency with neighboring tests; removing them would make files outliers.

Applied to files:

test/storage.test.ts

📚 Learning: 2026-06-04T06:14:24.975Z

Learnt from: ndycode
Repo: ndycode/codex-multi-auth PR: 510
File: test/runtime-rotation-proxy.test.ts:2478-2491
Timestamp: 2026-06-04T06:14:24.975Z
Learning: In ndycode/codex-multi-auth test files (e.g. `test/*.test.ts`), when creating V3 storage fixtures for accounts, it’s an intentional convention to use `as never` for deliberately minimal stored-account objects that only include `refreshToken`, `addedAt`, and `lastUsed`. Do not treat `as never` here as a type-safety problem: optional/other fields are expected to be populated by the runtime during execution, and the cast is used solely to keep the fixture minimal and consistent across existing tests.

Applied to files:

test/storage.test.ts

🔇 Additional comments (1)

test/helpers/global-sandbox.ts (1)

32-41: LGTM!

https://claude.ai/code/session_01XNtnkLbBiXZxfQQYLMpucB

ndycode · 2026-06-11T08:16:16Z

Full-suite validation of the global hook is in (npx vitest run --maxWorkers=1 on this branch, all 317 files):

54 failing test names, every one of them in the §7 environment baseline — zero new names. The suite-wide afterEach(vi.restoreAllMocks) changes no test outcome anywhere outside the fix's target.
Four baseline names now pass that previously failed: keeps rotating backup order deterministic across parallel saves, preserves historical backups when creating the latest backup fails, removes primary, backup, and wal artifacts, and rotates backups and retains historical snapshots. These are exactly the storage-suite cascade victims — they were never genuine environment failures, they were collateral of the spy leaked by the EACCES-failing test ahead of them. The baseline shrinks from 58 to 54.

So the change is a strict improvement: file-level failures stay at the known 8 environment files, and the only name-level deltas are recoveries. Validation evidence retained at /tmp/global-restore-run.txt name-diff against docs/audits/evidence/test-baseline-2026-06-10.txt.

Generated by Claude Code

ndycode pushed a commit that referenced this pull request Jun 11, 2026

docs: add the #590 spy-cascade isolation fix to the §5.2 delivery log

0a6b2e9

https://claude.ai/code/session_01XNtnkLbBiXZxfQQYLMpucB

ndycode mentioned this pull request Jun 11, 2026

docs: record the direct-coverage wave in the audit delivery log #568

Open

5 tasks

ndycode changed the title ~~test: restore all mocks in storage suite afterEach to stop spy cascades~~ test: stop leaked-spy cascades across the whole test runner Jun 11, 2026

ndycode added a commit that referenced this pull request Jun 11, 2026

Merge pull request #590 from ndycode/claude/audit-70-lib-docs-refresh

07f9adc

docs: include token-refresh in the rotation helper cluster description

coderabbitai Bot requested changes Jun 11, 2026

View reviewed changes

Comment thread test/storage.test.ts

test: note why the storage suite keeps its local restoreAllMocks

72e30ed

https://claude.ai/code/session_01XNtnkLbBiXZxfQQYLMpucB

ndycode mentioned this pull request Jun 11, 2026

docs: record the property-invariant wave in the audit delivery log #602

Open

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: stop leaked-spy cascades across the whole test runner#590

test: stop leaked-spy cascades across the whole test runner#590
ndycode wants to merge 3 commits into
mainfrom
claude/audit-71-storage-spy-hygiene

ndycode commented Jun 11, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Uh oh!

coderabbitai Bot left a comment

OVERVIEW

STRUCTURE

Uh oh!

Uh oh!

ndycode commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ndycode commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What Changed

Validation

Docs and Governance Checklist

Risk and Rollback

Uh oh!

chatgpt-codex-connector Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

OVERVIEW

STRUCTURE

Uh oh!

Uh oh!

ndycode commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ndycode commented Jun 11, 2026 •

edited

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading